
Automated Test: error-upsampling-race-condition #309

Closed
wants to merge 2 commits into from

Conversation


@admin-coderabbit admin-coderabbit commented Feb 4, 2026

This pull request was automatically created by @coderabbitai/e2e-reviewer.

Batch created pull request.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced error upsampling capabilities for organizations with allowlisted projects, enabling improved accuracy in error event aggregation.
    • Added new aggregation function for precise error event counting with sample-weight-based upsampling.
  • Improvements

    • Refined error message for topEvents validation to be more concise and user-friendly.

Yuval Mandelboum and others added 2 commits July 25, 2025 09:48
…(#94376)

Part of the Error Upsampling project:
https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link

The events-stats API will now check whether all projects in the query are
allowlisted for upsampling and, if so, convert the count query to a sum over
`sample_weight` in Snuba. This is done by defining a new SnQL function,
`upsampled_count()`.
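
A minimal sketch of that column swap (the helper name matches the PR's transform_query_columns_for_error_upsampling, but the body here is illustrative, inferred from the walkthrough and tests):

from collections.abc import Sequence

def transform_query_columns_for_error_upsampling(query_columns: Sequence[str]) -> list[str]:
    """Swap count() for the sample-weight-backed aggregate, aliased back to count."""
    transformed = []
    for column in query_columns:
        # Normalize whitespace and case when matching the bare count() aggregate
        if column.strip().lower() == "count()":
            transformed.append("upsampled_count() as count")
        else:
            transformed.append(column)
    return transformed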

I noticed that eps() and epm() functions are also in use in this endpoint.
I considered (and even started work on) swapping eps() and epm() as well,
since for correctness they should probably also use `sample_weight` rather
than counting naively. However, this caused some complications, and since
these functions are only used by specific dashboard widgets and are not
available in Discover, I decided to defer changing them until the need arises.
- Add 60-second cache for upsampling eligibility checks to improve performance
- Separate upsampling eligibility check from query transformation for better optimization
- Remove unnecessary null checks in upsampled_count() function per schema requirements
- Add cache invalidation utilities for configuration management

This improves performance during high-traffic periods by avoiding repeated
expensive allowlist lookups while maintaining data consistency.
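
A minimal sketch of such a cache layer, assuming Django's cache backend; helper names are illustrative, and the key here is built deterministically from sorted project IDs:

from collections.abc import Callable, Sequence

from django.core.cache import cache

CACHE_TTL_SECONDS = 60

def eligibility_cache_key(organization_id: int, project_ids: Sequence[int]) -> str:
    # Deterministic key from sorted project IDs; stable across worker processes
    project_key = ",".join(str(pid) for pid in sorted(project_ids))
    return f"error_upsampling_eligible:{organization_id}:{project_key}"

def cached_upsampling_eligibility(
    organization_id: int,
    project_ids: Sequence[int],
    compute: Callable[[], bool],
) -> bool:
    key = eligibility_cache_key(organization_id, project_ids)
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = compute()  # the uncached allowlist check
    cache.set(key, result, CACHE_TTL_SECONDS)
    return result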

coderabbit-eval bot commented Feb 4, 2026

📝 Walkthrough

Walkthrough

This pull request introduces an error upsampling feature that automatically scales down-sampled error event counts back to their full scope. A new helper module with eligibility checks and query column transformation is integrated into organization event stats queries, with supporting infrastructure and comprehensive tests.

Changes

  • Configuration & Build System (pyproject.toml, sentry-repo): Added mypy strict typing override for the error_upsampling module and included test discovery for the new test suite; updated submodule reference.
  • Core Error Upsampling Module (src/sentry/api/helpers/error_upsampling.py): New module implementing eligibility checks (via per-org/project allowlist with a 60-second cache), query column transformation (count() → upsampled_count() as count), and contextual gating based on dataset type and error-focused queries.
  • Query Processing Integration (src/sentry/api/endpoints/organization_events_stats.py, src/sentry/search/events/datasets/discover.py): Modified organization_events_stats to conditionally apply the upsampling transformation across all query execution paths (top_events and timeseries); added the upsampled_count SnQL function for sample_weight-based aggregation.
  • Event Storage Helpers (src/sentry/testutils/factories.py): Added a helper to extract and propagate sample_rate from the error_sampling context during event factory processing.
  • Test Infrastructure (tests/sentry/api/helpers/test_error_upsampling.py, tests/snuba/api/endpoints/test_organization_events_stats.py): New test module validating allowlist logic, column transformation, and error-focused query detection; integration tests covering allowlisted, partial, transaction, and non-allowlisted project scenarios.

Sequence Diagram

sequenceDiagram
    participant Request as Client<br/>(HTTP Request)
    participant Stats as organization_events_stats<br/>(_get_event_stats)
    participant Helper as error_upsampling<br/>(Helper Module)
    participant Discover as discover.py<br/>(Query Builder)
    participant Snuba as Snuba Backend

    Request->>Stats: GET /events/stats?query=...
    Stats->>Helper: is_errors_query_for_error_upsampled_projects()
    Helper->>Helper: Check org/project allowlist<br/>(with cache)
    Helper-->>Stats: upsampling_enabled: bool
    
    alt Upsampling Enabled
        Stats->>Helper: transform_query_columns_for_error_upsampling()
        Helper-->>Stats: final_columns with upsampled_count
        Stats->>Discover: Build query with final_columns
    else Upsampling Disabled
        Stats->>Discover: Build query with original columns
    end
    
    Discover->>Discover: Apply upsampled_count function<br/>via snql_aggregate
    Discover-->>Stats: Transformed SnQL query
    Stats->>Snuba: Execute query (timeseries/top_events)
    Snuba-->>Stats: Aggregated results
    Stats-->>Request: JSON response with counts

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hops through the upsampling dance,
Down-sampled counts get a second chance,
Allowlists checked with cache so spry,
Columns transformed before they fly—
Sample weights bloom in our query's span! 🌸

🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 warning)
  • Title check ⚠️ Warning: The pull request title 'Automated Test: error-upsampling-race-condition' does not accurately reflect the substantive changes. The PR implements the error upsampling feature, including new helpers, query transformations, and integration tests, but the title suggests only an automated test for a race condition. Resolution: revise the title to accurately describe the main feature implementation, such as 'Implement error upsampling with eligibility checking and query transformation', reflecting the actual scope of changes across helpers, endpoints, and datasets.

✅ Passed checks (1 passed)
  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.


@coderabbit-eval coderabbit-eval bot closed this Feb 4, 2026
@coderabbit-eval coderabbit-eval bot left a comment


Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@src/sentry/api/helpers/error_upsampling.py`:
- Around line 67-74: invalidate_upsampling_cache uses Python's unstable hash()
and requires the exact same project_ids set as the cache key, so replace the
unstable hash-based key with a stable representation (e.g., sorted join of
project IDs) or, better, implement prefix/version-based invalidation: either
build keys consistently the same way as the setter (use the same stable
serializer for the cache key in both the setter and invalidate_upsampling_cache)
or add a per-organization/version key (e.g., org_up_sampling_version) that you
bump when an allowlist changes and include that version in the stored key so
invalidate_upsampling_cache only needs to increment the org version (or delete
the prefix) instead of enumerating all project ID combinations; locate and
update both invalidate_upsampling_cache and the corresponding cache set call to
use the chosen stable/versioned scheme.
- Around line 130-140: _is_error_focused_query currently returns true for any
query containing the substring "event.type:error", which incorrectly matches
negated forms like "!event.type:error". Modify _is_error_focused_query to only
match non-negated tokens by using a token-aware check (e.g., a regex or parser)
that ensures "event.type:error" is a standalone token not preceded by negation
operators like "!" or "-" or the word "not" and respects token boundaries;
update the implementation in _is_error_focused_query to perform this safer match
so negated queries do not return True.
- Around line 43-64: The function _are_all_projects_error_upsampled accepts an
unused organization parameter and does an O(n×m) membership test; remove the
organization parameter from _are_all_projects_error_upsampled (and update its
callers accordingly) or use it to fetch an org-scoped allowlist, and convert the
allowlist to a set before the membership loop (e.g., allowlist_set =
set(allowlist)) so the all(project_id in allowlist_set for project_id in
project_ids) check is O(n); also keep the early-return behavior when project_ids
or allowlist is empty.
- Line 27: The cache key uses Python's process-randomized hash of
tuple(sorted(snuba_params.project_ids)), which breaks cross-process cache
consistency; replace that hashed value with a deterministic representation
(e.g., a canonical string of sorted project IDs or a stable digest via hashlib
on the joined IDs) when building cache_key in this module and do the same change
inside invalidate_upsampling_cache so both cache set and invalidate use the
identical deterministic key format (reference cache_key,
snuba_params.project_ids, organization.id, and the invalidate_upsampling_cache
function).

In `@tests/snuba/api/endpoints/test_organization_events_stats.py`:
- Around line 3609-3627: Wrap the request call in the test so it runs with the
same feature flag and auth context as other tests: use
self.feature({"organizations:discover-basic": True}) around the
self.client.get(...) invocation to enable the discover-basic feature, and update
the misleading inline comments near the assertions for data[0][1][0]["count"]
and data[1][1][0]["count"] to state they expect the upsampled value (10) instead
of "1 event"; refer to the test's use of self.client.get and the feature helper
to locate where to insert the wrapper and change the comments.
- Around line 3604-3607: The mock for
sentry.api.helpers.error_upsampling.options is too broad in
test_error_upsampling_with_allowlisted_projects (and the three other tests)
because mock_options.get.return_value makes every options.get() return the
project list; change the mock to use mock_options.get.side_effect that inspects
the requested key and returns [self.project.id, self.project2.id] only when key
== "issues.client_error_sampling.project_allowlist", otherwise delegate to the
real options.get (or return the original default) so only the intended option is
mocked; update all four test methods (including
test_error_upsampling_with_allowlisted_projects) to use this selective
side_effect.
- Around line 3569-3597: The exception field in the two test events sent via
store_event (the events with event_id "a"*32 and "b"*32) uses a bare list but
must be an object with a values key; change the exception payload to {"values":
[...]} for both occurrences so normalization picks up the exception data (update
the exception for the first store_event and the second store_event accordingly).
🧹 Nitpick comments (7)
src/sentry/testutils/factories.py (1)

344-358: Overly broad exception handling with silent pass suppresses potential bugs.

Two issues here:

  1. The first try/except (lines 347–352) wraps a chain of .get() calls with default {}. The only scenario in which this raises is if an intermediate value is not a mapping (e.g., "contexts" holds a string). Catching a bare Exception and silently passing hides data-shape issues that would be useful to surface in tests.

  2. if client_sample_rate: (line 353) is a truthy check — a legitimate client_sample_rate of 0 or 0.0 will be silently skipped. If the intent is to only skip None, use if client_sample_rate is not None:.

♻️ Suggested refactor
 def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None:
     """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid."""
-    client_sample_rate = None
-    try:
-        client_sample_rate = (
-            normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate")
-        )
-    except Exception:
-        pass
-    if client_sample_rate:
+    client_sample_rate = (
+        normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate")
+    )
+    if client_sample_rate is not None:
         try:
             normalized_data["sample_rate"] = float(client_sample_rate)
-        except Exception:
-            pass
+        except (TypeError, ValueError):
+            pass  # non-numeric sample rate, ignore

The .get() chain with {} defaults is inherently safe and doesn't need a try/except. The float() conversion only needs to guard against TypeError/ValueError, not all exceptions.

tests/sentry/api/helpers/test_error_upsampling.py (3)

19-35: Consider using self.create_organization() instead of direct ORM call.

Line 21 creates the organization via Organization.objects.create(name="test-org"), bypassing the factory's Factories.create_organization() which sets up required related objects (org slug reservation, org mapping, etc.). While this may work for the current tests since they only test helper logic without hitting the database for org lookups, it's inconsistent with the rest of the test suite and could cause issues if the tests evolve.

♻️ Suggested fix
     def setUp(self) -> None:
-        self.organization = Organization.objects.create(name="test-org")
+        self.organization = self.create_organization(name="test-org")
         self.projects = [

54-75: Missing test coverage for edge cases in column transformation.

The tests cover the basic count() case, case insensitivity, and whitespace, but consider adding tests for:

  • Multiple count() occurrences in the column list
  • count() with arguments like count(id) (should these be transformed?)
  • Columns that contain count() as a substring (e.g., count_unique())

These edge cases would help ensure the transformation regex/logic is precise.
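
A sketch of what those cases could look like (the expected outputs for count(id) and count_unique() are assumptions to confirm against the helper's intended behavior):

def test_transform_edge_cases(self) -> None:
    # Multiple occurrences: every bare count() should be swapped
    assert transform_query_columns_for_error_upsampling(["count()", "count()"]) == [
        "upsampled_count() as count",
        "upsampled_count() as count",
    ]
    # count_unique(user) merely contains "count" as a substring; assumed pass-through
    assert transform_query_columns_for_error_upsampling(["count_unique(user)"]) == [
        "count_unique(user)"
    ]
    # count(id): whether argument forms should transform is the open question above;
    # this pins pass-through behavior until that is decided
    assert transform_query_columns_for_error_upsampling(["count(id)"]) == ["count(id)"]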


90-101: Missing test for the main public API is_errors_query_for_error_upsampled_projects.

The tests cover private helper functions but don't test the primary public entry point is_errors_query_for_error_upsampled_projects, which is the function actually consumed by the endpoint. Consider adding an integration-style test that validates the composed behavior.
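
A hedged sketch of such a test, reusing this class's setup (the argument order matches the endpoint call site shown below; passing the discover dataset module is an assumption about the gating logic):

from sentry.snuba import discover  # assumed gate-satisfying dataset module

def test_is_errors_query_for_error_upsampled_projects(self) -> None:
    with mock.patch("sentry.api.helpers.error_upsampling.options") as mock_options:
        mock_options.get.return_value = [p.id for p in self.projects]
        request = mock.Mock()
        request.GET = {"query": "event.type:error"}
        snuba_params = mock.Mock(project_ids=[p.id for p in self.projects])
        assert is_errors_query_for_error_upsampled_projects(
            snuba_params, self.organization, discover, request
        )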

src/sentry/api/endpoints/organization_events_stats.py (2)

218-227: Unnecessary variable indirection and misleading comments.

upsampling_enabled (line 226) is just an alias for should_upsample (line 220) with no additional logic. The comments reference "caching" and "performance optimization" but there's no caching happening — it's a single function call assigned to a variable.

♻️ Suggested simplification
-            # Early upsampling eligibility check for performance optimization
-            # This cached result ensures consistent behavior across query execution
-            should_upsample = is_errors_query_for_error_upsampled_projects(
+            should_upsample = is_errors_query_for_error_upsampled_projects(
                 snuba_params, organization, dataset, request
             )
-            
-            # Store the upsampling decision to apply later during query building
-            # This separation allows for better query optimization and caching
-            upsampling_enabled = should_upsample
-            final_columns = query_columns
+            final_columns = (
+                transform_query_columns_for_error_upsampling(query_columns)
+                if should_upsample
+                else query_columns
+            )

This also eliminates the need to repeat the if upsampling_enabled: final_columns = transform_query_columns_for_error_upsampling(query_columns) block three times in each branch below (lines 232–233, 276–277, 295–296), since final_columns would already be set correctly.


229-272: Redundant transformation logic repeated in every branch.

The if upsampling_enabled: final_columns = transform_query_columns_for_error_upsampling(query_columns) pattern is repeated identically in all three mutually exclusive paths (lines 232–233 for top events, lines 276–277 for RPC, lines 295–296 for standard). Since all branches need the same transformation, compute final_columns once before the branching (as suggested above) and remove the per-branch duplication.

♻️ Remove per-branch duplication
             if top_events > 0:
-                # Apply upsampling transformation just before query execution
-                # This late transformation ensures we use the most current schema assumptions
-                if upsampling_enabled:
-                    final_columns = transform_query_columns_for_error_upsampling(query_columns)
-                    
                 if use_rpc:
                     return scoped_dataset.run_top_events_timeseries_query(

And similarly for the RPC and standard paths.

src/sentry/search/events/datasets/discover.py (1)

1041-1052: Use _ for the unused args parameter to match file conventions.

Other functions in this file (e.g., lines 182, 253, 271, 281, 290) use lambda _, alias: for unused lambda parameters. This function should follow the same pattern.

♻️ Proposed fix
                 SnQLFunction(
                     "upsampled_count",
                     required_args=[],
                     # Optimized aggregation for error upsampling - assumes sample_weight
                     # exists for all events in allowlisted projects as per schema design
-                    snql_aggregate=lambda args, alias: Function(
+                    snql_aggregate=lambda _, alias: Function(
                         "toInt64",
                         [Function("sum", [Column("sample_weight")])],
                         alias,
                     ),
                     default_result_type="number",
                 ),

The sample_weight column assumption is safe—callers are properly guarded by _should_apply_sample_weight_transform() and _are_all_projects_error_upsampled() which ensure the transform applies only to eligible error events.

Comment on line 27

    expensive repeated option lookups during high-traffic periods. This is safe
    because allowlist changes are infrequent and eventual consistency is acceptable.
    """
    cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"


⚠️ Potential issue | 🟠 Major

hash() is not stable across Python processes — cache keys will diverge in multi-process deployments.

Python's built-in hash() for tuples is randomized per process (due to PYTHONHASHSEED). In a multi-worker deployment, each process will compute a different hash for the same set of project IDs, meaning:

  1. Cache hits will fail across workers, defeating the caching purpose.
  2. invalidate_upsampling_cache called from one process won't clear keys set by another.

Use a deterministic representation instead.

Proposed fix
-    cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"
+    project_key = ",".join(str(pid) for pid in sorted(snuba_params.project_ids))
+    cache_key = f"error_upsampling_eligible:{organization.id}:{project_key}"

Apply the same fix in invalidate_upsampling_cache (line 73):

-    cache_key = f"error_upsampling_eligible:{organization_id}:{hash(tuple(sorted(project_ids)))}"
+    project_key = ",".join(str(pid) for pid in sorted(project_ids))
+    cache_key = f"error_upsampling_eligible:{organization_id}:{project_key}"
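
If shorter keys are preferred, the same determinism can come from a hashlib digest of the joined IDs (a sketch of that variant; the helper name is illustrative):

import hashlib

def stable_project_key(project_ids) -> str:
    joined = ",".join(str(pid) for pid in sorted(project_ids))
    # sha1 is used for key shortening here, not for security
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

cache_key = f"error_upsampling_eligible:{organization.id}:{stable_project_key(snuba_params.project_ids)}"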

Comment on lines +43 to +64
def _are_all_projects_error_upsampled(
    project_ids: Sequence[int], organization: Organization
) -> bool:
    """
    Check if ALL projects in the query are allowlisted for error upsampling.
    Only returns True if all projects pass the allowlist condition.
    NOTE: This function reads the allowlist configuration fresh each time,
    which means it can return different results between calls if the
    configuration changes during request processing. This is intentional
    to ensure we always have the latest configuration state.
    """
    if not project_ids:
        return False

    allowlist = options.get("issues.client_error_sampling.project_allowlist", [])
    if not allowlist:
        return False

    # All projects must be in the allowlist
    result = all(project_id in allowlist for project_id in project_ids)
    return result


🛠️ Refactor suggestion | 🟠 Major

organization parameter is unused; allowlist lookup is O(n×m) on lists.

Two issues here:

  1. The organization parameter is accepted but never referenced (confirmed by static analysis). Either remove it or use it for the allowlist lookup.
  2. project_id in allowlist on a list is O(m) per call, making the all(...) check O(n×m). Convert to a set for O(n) total.
Proposed fix
 def _are_all_projects_error_upsampled(
-    project_ids: Sequence[int], organization: Organization
+    project_ids: Sequence[int],
 ) -> bool:
     if not project_ids:
         return False
 
     allowlist = options.get("issues.client_error_sampling.project_allowlist", [])
     if not allowlist:
         return False
 
+    allowlist_set = set(allowlist)
     # All projects must be in the allowlist
-    result = all(project_id in allowlist for project_id in project_ids)
+    result = all(project_id in allowlist_set for project_id in project_ids)
     return result

If organization is removed here, also update the caller on line 35:

-    is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization)
+    is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids)
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 44-44: Unused function argument: organization

(ARG001)


Comment on lines +67 to +74
def invalidate_upsampling_cache(organization_id: int, project_ids: Sequence[int]) -> None:
    """
    Invalidate the upsampling eligibility cache for the given organization and projects.
    This should be called when the allowlist configuration changes to ensure
    cache consistency across the system.
    """
    cache_key = f"error_upsampling_eligible:{organization_id}:{hash(tuple(sorted(project_ids)))}"
    cache.delete(cache_key)


⚠️ Potential issue | 🟠 Major

Cache invalidation key must match the key used for cache set — both have the same hash() instability issue.

This function has the same hash() instability problem as noted on line 27. Additionally, this function is only useful if called with the exact same set of project IDs that was used in the original query. In practice, allowlist changes affect individual projects, so the caller would need to know all possible project ID combinations that were cached — which is impractical.

Consider whether a prefix-based invalidation or a versioned approach would be more robust.
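
A sketch of the versioned scheme (key names like error_upsampling_version are illustrative, assuming Django's cache API):

from django.core.cache import cache

def _version_key(organization_id: int) -> str:
    return f"error_upsampling_version:{organization_id}"

def _org_allowlist_version(organization_id: int) -> int:
    # Per-org counter folded into every eligibility key
    return cache.get_or_set(_version_key(organization_id), 1)

def eligibility_cache_key(organization_id: int, project_ids) -> str:
    version = _org_allowlist_version(organization_id)
    project_key = ",".join(str(pid) for pid in sorted(project_ids))
    return f"error_upsampling_eligible:v{version}:{organization_id}:{project_key}"

def invalidate_upsampling_cache(organization_id: int) -> None:
    # Bumping the version orphans every key derived from it; no need to
    # enumerate project ID combinations
    try:
        cache.incr(_version_key(organization_id))
    except ValueError:
        cache.set(_version_key(organization_id), 2)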


Comment on lines +130 to +140
def _is_error_focused_query(request: Request) -> bool:
    """
    Check if a query is focused on error events.
    Reduced to only check for event.type:error to err on the side of caution.
    """
    query = request.GET.get("query", "").lower()

    if "event.type:error" in query:
        return True

    return False


⚠️ Potential issue | 🔴 Critical

_is_error_focused_query matches negated queries, returning a false positive.

A query like !event.type:error (which explicitly excludes errors) contains the substring "event.type:error" and will incorrectly return True. This would apply the upsampling transform to a non-error query, producing wrong results.

Proposed fix
 def _is_error_focused_query(request: Request) -> bool:
     """
     Check if a query is focused on error events.
-    Reduced to only check for event.type:error to err on the side of caution.
+    Only matches positive event.type:error, not negated queries.
     """
     query = request.GET.get("query", "").lower()
 
-    if "event.type:error" in query:
+    # Check for event.type:error but not !event.type:error
+    if "event.type:error" in query and "!event.type:error" not in query:
         return True
 
     return False
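
A token-aware variant could look like this (a sketch; the regex encodes assumptions about the discover query grammar, and `not`-style negation would still need real parser support):

import re

from rest_framework.request import Request

# event.type:error as a standalone token, not preceded by !, -, or a word character
_ERROR_TOKEN = re.compile(r"(?<![!\-\w.])event\.type:error(?![\w.])", re.IGNORECASE)

def _is_error_focused_query(request: Request) -> bool:
    query = request.GET.get("query", "")
    return bool(_ERROR_TOKEN.search(query))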

Comment on lines +3569 to +3597
        # Store some error events with error_sampling context
        self.store_event(
            data={
                "event_id": "a" * 32,
                "message": "very bad",
                "type": "error",
                "exception": [{"type": "ValueError", "value": "Something went wrong 1"}],
                "timestamp": (self.day_ago + timedelta(minutes=1)).isoformat(),
                "fingerprint": ["group1"],
                "tags": {"sentry:user": self.user.email},
                "contexts": {"error_sampling": {"client_sample_rate": 0.1}},
            },
            project_id=self.project.id,
        )
        self.store_event(
            data={
                "event_id": "b" * 32,
                "message": "oh my",
                "type": "error",
                "exception": [{"type": "ValueError", "value": "Something went wrong 2"}],
                "timestamp": (self.day_ago + timedelta(hours=1, minutes=1)).isoformat(),
                "fingerprint": ["group2"],
                "tags": {"sentry:user": self.user2.email},
                "contexts": {"error_sampling": {"client_sample_rate": 0.1}},
            },
            project_id=self.project2.id,
        )
        self.wait_for_event_count(self.project.id, 1)
        self.wait_for_event_count(self.project2.id, 1)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed against the repository:

#!/bin/bash
# Search for wait_for_event_count method definition
rg -n "def wait_for_event_count" --type=py

# Check the base classes of the test file
head -100 tests/snuba/api/endpoints/test_organization_events_stats.py | grep -E "class.*\(|^import|^from"

# Look for exception field usage patterns in similar test files
rg -n '"exception"\s*:\s*\[' --type=py tests/snuba/api/endpoints/ -A 2 | head -50

# Also search for exception format with "values" key
rg -n '"exception"\s*:\s*\{.*"values"' --type=py tests/snuba/api/endpoints/ -A 1 | head -50

# Check the SnubaTestCase and APITestCase class definitions
rg -n "^class (SnubaTestCase|APITestCase)" --type=py src/sentry/testutils/cases.py -A 5

# Verify the inheritance of wait_for_event_count method - which class it belongs to
rg -B 10 "def wait_for_event_count" src/sentry/testutils/cases.py | head -20

# Get more context to see which class wait_for_event_count belongs to
sed -n '1000,1090p' src/sentry/testutils/cases.py | grep -E "^class|def wait_for_event_count" -B 20 | tail -25

# Check what BaseTestCase is and whether it has wait_for_event_count
rg -n "^class BaseTestCase" --type=py src/sentry/testutils/cases.py -A 3

# Verify exception format by checking how load_data or other utilities format exceptions
rg -n "load_data\(" tests/snuba/api/endpoints/test_organization_events_stats.py -B 2 -A 5 | head -40


Fix exception field schema in event data.

The exception field at lines 3575 and 3588 uses an incorrect schema. It should be wrapped in an object with a values key: {"values": [{"type": ..., "value": ...}]}, not a bare list. The bare list format may be silently ignored during event normalization, resulting in events stored without proper exception data.

Proposed fix
-                "exception": [{"type": "ValueError", "value": "Something went wrong 1"}],
+                "exception": {"values": [{"type": "ValueError", "value": "Something went wrong 1"}]},

Apply the same fix for the second event at line 3588.


Comment on lines +3604 to +3607
    @mock.patch("sentry.api.helpers.error_upsampling.options")
    def test_error_upsampling_with_allowlisted_projects(self, mock_options):
        # Set up allowlisted projects
        mock_options.get.return_value = [self.project.id, self.project2.id]


⚠️ Potential issue | 🟠 Major

Overly broad mock — all options.get() calls will return project IDs list.

mock_options.get.return_value = [self.project.id, self.project2.id] replaces every options.get() call within the helper module, not just the one for "issues.client_error_sampling.project_allowlist". If any other option is read in the code path, it will also receive this list, potentially causing subtle misbehavior.

Use side_effect to selectively mock only the relevant key:

Proposed fix
 @mock.patch("sentry.api.helpers.error_upsampling.options")
 def test_error_upsampling_with_allowlisted_projects(self, mock_options):
-    mock_options.get.return_value = [self.project.id, self.project2.id]
+    def mock_get(key, default=None):
+        if key == "issues.client_error_sampling.project_allowlist":
+            return [self.project.id, self.project2.id]
+        from sentry import options as real_options
+        return real_options.get(key, default)
+    mock_options.get.side_effect = mock_get

This applies to all four test methods in this class.


Comment on lines +3609 to +3627
        # Test with count() aggregation
        response = self.client.get(
            self.url,
            data={
                "start": self.day_ago.isoformat(),
                "end": (self.day_ago + timedelta(hours=2)).isoformat(),
                "interval": "1h",
                "yAxis": "count()",
                "query": "event.type:error",
                "project": [self.project.id, self.project2.id],
            },
            format="json",
        )

        assert response.status_code == 200, response.content
        data = response.data["data"]
        assert len(data) == 2  # Two time buckets
        assert data[0][1][0]["count"] == 10  # First bucket has 1 event
        assert data[1][1][0]["count"] == 10  # Second bucket has 1 event


⚠️ Potential issue | 🟡 Minor

Missing feature flag and authentication context; misleading inline comments.

  1. Unlike other test classes in this file, this test calls self.client.get() directly without wrapping in self.feature(...). If the endpoint requires organizations:discover-basic (as the other test classes assume), these tests may fail or behave differently.

  2. The inline comments on lines 3626-3627 say "First bucket has 1 event" but the assertion checks for count == 10 (the upsampled value). The comments should reflect the expected upsampled count.

Proposed fix for comments
-        assert data[0][1][0]["count"] == 10  # First bucket has 1 event
-        assert data[1][1][0]["count"] == 10  # Second bucket has 1 event
+        assert data[0][1][0]["count"] == 10  # 1 event upsampled by 1/0.1 = 10
+        assert data[1][1][0]["count"] == 10  # 1 event upsampled by 1/0.1 = 10

Consider wrapping the request in self.feature({"organizations:discover-basic": True}) for consistency.

